Efficient Methods for Multi-label Classification

Authors

  • Chonglin Sun
  • Chunting Zhou
  • Bo Jin
  • Francis C. M. Lau
Abstract

As a generalization of multi-class classification, multi-label classification allows each sample to be associated with multiple labels. The task becomes challenging when the number of labels is large, which makes efficiency essential. Many approaches have been proposed to address this problem; one of the main ideas is to select a subset of labels that approximately spans the original label space and to train only on the selected labels. However, the existing sampling algorithms either require a nondeterministic number of sampling trials or are time-consuming. In this paper, we propose two label selection methods for multi-label classification: (i) clustering-based sampling (CBS), which uses a deterministic number of sampling trials; and (ii) frequency-based sampling (FBS), which uses only label frequency statistics and is therefore more efficient. Moreover, neither algorithm needs to perform singular value decomposition (SVD) on the label matrix, which the previous approaches require. Experiments on several real-world multi-label data sets, with the number of labels ranging from hundreds to thousands, show that the proposed approaches achieve state-of-the-art performance among multi-label classification algorithms based on label space reduction.
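The abstract does not spell out the sampling rule behind FBS, so the following is only a rough sketch of what frequency-driven label selection could look like, not the authors' algorithm. It assumes a binary label matrix `Y` (samples by labels) and assumes that labels are drawn without replacement with probability proportional to how often they occur; the function name `frequency_based_label_sample` and its parameters are hypothetical.

```python
import numpy as np


def frequency_based_label_sample(Y, k, seed=0):
    """Pick k distinct label columns of Y, favouring frequent labels.

    ASSUMPTION: sampling probability is proportional to label frequency.
    The abstract only says FBS uses label frequency statistics.
    """
    Y = np.asarray(Y)
    # Number of positive samples per label (column sums of the 0/1 matrix).
    freq = Y.sum(axis=0).astype(float)
    if freq.sum() == 0:
        raise ValueError("label matrix has no positive entries")
    if k > np.count_nonzero(freq):
        raise ValueError("k exceeds the number of labels that occur at all")

    probs = freq / freq.sum()
    rng = np.random.default_rng(seed)
    # A single draw of k labels: the number of sampling trials is fixed
    # in advance, and no SVD of the label matrix is needed.
    chosen = rng.choice(Y.shape[1], size=k, replace=False, p=probs)
    return np.sort(chosen)


# Toy label matrix: 3 samples, 4 labels; label 1 never occurs.
Y_toy = np.array([[1, 0, 1, 0],
                  [1, 0, 0, 0],
                  [1, 0, 1, 1]])
selected = frequency_based_label_sample(Y_toy, k=2, seed=1)
```

In a label space reduction setting, a classifier would then be trained only on the columns `Y[:, selected]`; how predictions for the remaining labels are recovered is not covered by this sketch.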


Similar resources

Exploiting Associations between Class Labels in Multi-label Classification

Multi-label classification has many applications in text categorization, biology, and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. As there are often relationships between the labels, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...


MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection

Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...


Efficient Multi-label Classification for Evolving Data Streams

Many real world problems involve data which can be considered as multi-label data streams. Efficient methods exist for multi-label classification in non streaming scenarios. However, learning in evolving streaming scenarios is more challenging, as the learners must be able to adapt to change using limited time and memory. This paper proposes a new experimental framework for studying multi-label...


Streaming Multi-label Classification

This paper presents a new experimental framework for studying multi-label evolving stream classification, with efficient methods that combine the best practices in streaming scenarios with the best practices in multi-label classification. Many real world problems involve data which can be considered as multi-label data streams. Efficient methods exist for multi-label classification in non strea...


A Mixtures-of-Experts Framework for Multi-Label Classification

We develop a novel probabilistic approach for multi-label classification that is based on the mixtures-of-experts architecture combined with recently introduced conditional tree-structured Bayesian networks. Our approach captures different input-output relations from multi-label data using the efficient tree-structured classifiers, while the mixtures-of-experts architecture aims to compensate f...



Publication date: 2015